{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MNIST with SciKit-Learn and skorch\n",
    "\n",
    "This notebooks shows how to define and train a simple Neural-Network with PyTorch and use it via skorch with SciKit-Learn."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import fetch_mldata\n",
    "from sklearn.model_selection import train_test_split\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading Data\n",
    "Using SciKit-Learns ```fetch_mldata``` to load MNIST data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "mnist = fetch_mldata('MNIST original')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'COL_NAMES': ['label', 'data'],\n",
       " 'DESCR': 'mldata.org dataset: mnist-original',\n",
       " 'data': array([[0, 0, 0, ..., 0, 0, 0],\n",
       "        [0, 0, 0, ..., 0, 0, 0],\n",
       "        [0, 0, 0, ..., 0, 0, 0],\n",
       "        ..., \n",
       "        [0, 0, 0, ..., 0, 0, 0],\n",
       "        [0, 0, 0, ..., 0, 0, 0],\n",
       "        [0, 0, 0, ..., 0, 0, 0]], dtype=uint8),\n",
       " 'target': array([ 0.,  0.,  0., ...,  9.,  9.,  9.])}"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mnist"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(70000, 784)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mnist.data.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Preprocessing Data\n",
    "\n",
    "Each image of the MNIST dataset is encoded in a 784 dimensional vector, representing a 28 x 28 pixel image. Each pixel has a value between 0 and 255, corresponding to the grey-value of a pixel.<br />\n",
    "The above ```featch_mldata``` method to load MNIST returns ```data``` and ```target``` as ```uint8``` which we convert to ```float32``` and ```int64``` respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = mnist.data.astype('float32')\n",
    "y = mnist.target.astype('int64')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we will use ReLU as activation in combination with softmax over the output layer, we need to scale `X` down. An often use range is [0, 1]."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "X /= 255.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.0, 1.0)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X.min(), X.max()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: data is not normalized."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert(X_train.shape[0] + X_test.shape[0] == mnist.data.shape[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "((52500, 784), (52500,))"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_train.shape, y_train.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build Neural Network with Torch\n",
    "Simple, fully connected neural network with one hidden layer. Input layer has 784 dimensions (28x28), hidden layer has 98 (= 784 / 8) and output layer 10 neurons, representing digits 0 - 9."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch import nn\n",
    "import torch.nn.functional as F"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "torch.manual_seed(0);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "mnist_dim = X.shape[1]\n",
    "hidden_dim = int(mnist_dim/8)\n",
    "output_dim = len(np.unique(mnist.target))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(784, 98, 10)"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mnist_dim, hidden_dim, output_dim"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A Neural network in PyTorch's framework."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "class ClassifierModule(nn.Module):\n",
    "    def __init__(\n",
    "            self,\n",
    "            input_dim=mnist_dim,\n",
    "            hidden_dim=hidden_dim,\n",
    "            output_dim=output_dim,\n",
    "            dropout=0.5,\n",
    "    ):\n",
    "        super(ClassifierModule, self).__init__()\n",
    "        self.dropout = nn.Dropout(dropout)\n",
    "\n",
    "        self.hidden = nn.Linear(input_dim, hidden_dim)\n",
    "        self.output = nn.Linear(hidden_dim, output_dim)\n",
    "\n",
    "    def forward(self, X, **kwargs):\n",
    "        X = F.relu(self.hidden(X))\n",
    "        X = self.dropout(X)\n",
    "        X = F.softmax(self.output(X), dim=-1)\n",
    "        return X"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Skorch allows to use PyTorch's networks in the SciKit-Learn setting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "from skorch.net import NeuralNetClassifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "net = NeuralNetClassifier(\n",
    "    ClassifierModule,\n",
    "    max_epochs=20,\n",
    "    lr=0.1,\n",
    "    # use_cuda=True,  # uncomment this to train with CUDA\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  epoch    train_loss    valid_acc    valid_loss     dur\n",
      "-------  ------------  -----------  ------------  ------\n",
      "      1        \u001b[36m0.8343\u001b[0m       \u001b[32m0.8983\u001b[0m        \u001b[35m0.3821\u001b[0m  1.4131\n",
      "      2        \u001b[36m0.4338\u001b[0m       \u001b[32m0.9193\u001b[0m        \u001b[35m0.2961\u001b[0m  1.4066\n",
      "      3        \u001b[36m0.3625\u001b[0m       \u001b[32m0.9319\u001b[0m        \u001b[35m0.2424\u001b[0m  1.3839\n",
      "      4        \u001b[36m0.3275\u001b[0m       \u001b[32m0.9382\u001b[0m        \u001b[35m0.2199\u001b[0m  1.3872\n",
      "      5        \u001b[36m0.2967\u001b[0m       \u001b[32m0.9435\u001b[0m        \u001b[35m0.1989\u001b[0m  1.4129\n",
      "      6        \u001b[36m0.2800\u001b[0m       \u001b[32m0.9467\u001b[0m        \u001b[35m0.1835\u001b[0m  1.2378\n",
      "      7        \u001b[36m0.2615\u001b[0m       \u001b[32m0.9513\u001b[0m        \u001b[35m0.1695\u001b[0m  1.1722\n",
      "      8        \u001b[36m0.2467\u001b[0m       \u001b[32m0.9534\u001b[0m        \u001b[35m0.1612\u001b[0m  1.4182\n",
      "      9        \u001b[36m0.2385\u001b[0m       \u001b[32m0.9551\u001b[0m        \u001b[35m0.1533\u001b[0m  1.4431\n",
      "     10        \u001b[36m0.2276\u001b[0m       \u001b[32m0.9560\u001b[0m        \u001b[35m0.1471\u001b[0m  1.3997\n",
      "     11        \u001b[36m0.2201\u001b[0m       \u001b[32m0.9573\u001b[0m        \u001b[35m0.1434\u001b[0m  1.3977\n",
      "     12        \u001b[36m0.2118\u001b[0m       \u001b[32m0.9579\u001b[0m        \u001b[35m0.1380\u001b[0m  1.4252\n",
      "     13        \u001b[36m0.2070\u001b[0m       \u001b[32m0.9613\u001b[0m        \u001b[35m0.1335\u001b[0m  1.4033\n",
      "     14        \u001b[36m0.1994\u001b[0m       0.9609        \u001b[35m0.1316\u001b[0m  1.4070\n",
      "     15        \u001b[36m0.1979\u001b[0m       \u001b[32m0.9618\u001b[0m        \u001b[35m0.1257\u001b[0m  1.4061\n",
      "     16        \u001b[36m0.1915\u001b[0m       \u001b[32m0.9638\u001b[0m        \u001b[35m0.1228\u001b[0m  1.4022\n",
      "     17        \u001b[36m0.1881\u001b[0m       \u001b[32m0.9651\u001b[0m        \u001b[35m0.1189\u001b[0m  1.4358\n",
      "     18        \u001b[36m0.1835\u001b[0m       0.9651        \u001b[35m0.1167\u001b[0m  1.4076\n",
      "     19        \u001b[36m0.1769\u001b[0m       \u001b[32m0.9654\u001b[0m        \u001b[35m0.1150\u001b[0m  1.4259\n",
      "     20        \u001b[36m0.1717\u001b[0m       \u001b[32m0.9665\u001b[0m        \u001b[35m0.1129\u001b[0m  1.4119\n"
     ]
    }
   ],
   "source": [
    "net.fit(X_train, y_train);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prediction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "predicted = net.predict(X_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.96325714285714281"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.mean(predicted == y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An accuracy of nearly 96% for a network with only one hidden layer is not too bad"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Convolutional Network\n",
    "PyTorch expects a 4 dimensional tensor as input for its 2D convolution layer. The dimensions represent:\n",
    "* Batch size\n",
    "* Number of channel\n",
    "* Height\n",
    "* Width\n",
    "\n",
    "As initial batch size the number of examples needs to be provided. MNIST data has only one channel. As stated above, each MNIST vector represents a 28x28 pixel image. Hence, the resulting shape for PyTorch tensor needs to be (x, 1, 28, 28). "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "XCnn = X.reshape(-1, 1, 28, 28)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(70000, 1, 28, 28)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "XCnn.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "XCnn_train, XCnn_test, y_train, y_test = train_test_split(XCnn, y, test_size=0.25, random_state=42)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "((52500, 1, 28, 28), (52500,))"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "XCnn_train.shape, y_train.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Cnn(nn.Module):\n",
    "    def __init__(self):\n",
    "        super(Cnn, self).__init__()\n",
    "        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)\n",
    "        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)\n",
    "        self.conv2_drop = nn.Dropout2d()\n",
    "        self.fc1 = nn.Linear(1600, 128) # 1600 = number channels * width * height\n",
    "        self.fc2 = nn.Linear(128, 10)\n",
    "\n",
    "    def forward(self, x):\n",
    "        x = F.relu(F.max_pool2d(self.conv1(x), 2))\n",
    "        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))\n",
    "        x = x.view(-1, x.size(1) * x.size(2) * x.size(3)) # flatten over channel, height and width = 1600\n",
    "        x = F.relu(self.fc1(x))\n",
    "        x = F.dropout(x, training=self.training)\n",
    "        x = self.fc2(x)\n",
    "        x = F.softmax(x, dim=-1)\n",
    "        return x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "cnn = NeuralNetClassifier(\n",
    "    Cnn,\n",
    "    max_epochs=15,\n",
    "    lr=1,\n",
    "    optimizer=torch.optim.Adadelta,\n",
    "    # use_cuda=True,  # uncomment this to train with CUDA\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "  epoch    train_loss    valid_acc    valid_loss      dur\n",
      "-------  ------------  -----------  ------------  -------\n",
      "      1        \u001b[36m0.4168\u001b[0m       \u001b[32m0.9748\u001b[0m        \u001b[35m0.0836\u001b[0m  18.2292\n",
      "      2        \u001b[36m0.1455\u001b[0m       \u001b[32m0.9823\u001b[0m        \u001b[35m0.0594\u001b[0m  18.0943\n",
      "      3        \u001b[36m0.1129\u001b[0m       \u001b[32m0.9849\u001b[0m        \u001b[35m0.0503\u001b[0m  19.4160\n",
      "      4        \u001b[36m0.0940\u001b[0m       \u001b[32m0.9856\u001b[0m        \u001b[35m0.0433\u001b[0m  17.4486\n",
      "      5        \u001b[36m0.0836\u001b[0m       0.9855        0.0460  19.5266\n",
      "      6        \u001b[36m0.0788\u001b[0m       \u001b[32m0.9869\u001b[0m        \u001b[35m0.0379\u001b[0m  19.8825\n",
      "      7        \u001b[36m0.0681\u001b[0m       \u001b[32m0.9881\u001b[0m        \u001b[35m0.0369\u001b[0m  18.6277\n",
      "      8        \u001b[36m0.0662\u001b[0m       \u001b[32m0.9891\u001b[0m        \u001b[35m0.0356\u001b[0m  19.2907\n",
      "      9        \u001b[36m0.0630\u001b[0m       0.9879        \u001b[35m0.0340\u001b[0m  18.7650\n",
      "     10        \u001b[36m0.0575\u001b[0m       0.9890        \u001b[35m0.0324\u001b[0m  17.2828\n",
      "     11        \u001b[36m0.0563\u001b[0m       0.9886        0.0333  17.8509\n",
      "     12        \u001b[36m0.0523\u001b[0m       0.9881        0.0357  17.3866\n",
      "     13        \u001b[36m0.0516\u001b[0m       \u001b[32m0.9903\u001b[0m        0.0326  17.8570\n",
      "     14        \u001b[36m0.0462\u001b[0m       0.9901        \u001b[35m0.0320\u001b[0m  18.2953\n",
      "     15        0.0464       0.9897        \u001b[35m0.0313\u001b[0m  18.2699\n"
     ]
    }
   ],
   "source": [
    "cnn.fit(XCnn_train, y_train);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "cnn_pred = cnn.predict(XCnn_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.99102857142857148"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.mean(cnn_pred == y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "An accuracy of 99.1% should suffice for this example!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
